摘要 :
Prior research into search system scalability has primarily addressed query processing efficiency [1, 2, 3] or indexing efficiency [3], or has presented some arbitrary system architecture [4]. Little work has introduced any formal...
展开
Prior research into search system scalability has primarily addressed query processing efficiency [1, 2, 3] or indexing efficiency [3], or has presented some arbitrary system architecture [4]. Little work has introduced any formal theoretical framework for evaluating architectures with regard to specific operational requirements, or for comparing architectures beyond simple timings [5] or basic simulations [6, 7]. In this paper, we present a framework based upon queuing network theory for analyzing search systems in terms of operational requirements. We use response time, throughput, and utilization as the key operational characteristics for evaluating performance. Within this framework, we present a scalability strategy that combines index partitioning and index replication to satisfy a given set of requirements.
收起
摘要 :
The highly dynamic nature of online commenting environments makes accurate ratings prediction for new comments challenging. In such a setting, in addition to exploiting comments with high predicted ratings, it is also critical to ...
展开
The highly dynamic nature of online commenting environments makes accurate ratings prediction for new comments challenging. In such a setting, in addition to exploiting comments with high predicted ratings, it is also critical to explore comments with high uncertainty in the predictions. In this paper, we propose a novel upper confidence bound (UCB) algorithm called LogUCB that balances exploration with exploitation when the average rating of a comment is modeled using logistic regression on its features. At the core of our LogUCB algorithm lies a novel variance approximation technique for the Bayesian logistic regression model that is used to compute the UCB value for each comment. In experiments with a real-life comments dataset from Yahoo! News, we show that LogUCB with bag-of-words and topic features outperforms state-of-the-art explore-exploit algorithms.
收起
摘要 :
The highly dynamic nature of online commenting environments makes accurate ratings prediction for new comments challenging. In such a setting, in addition to exploiting comments with high predicted ratings, it is also critical to ...
展开
The highly dynamic nature of online commenting environments makes accurate ratings prediction for new comments challenging. In such a setting, in addition to exploiting comments with high predicted ratings, it is also critical to explore comments with high uncertainty in the predictions. In this paper, we propose a novel upper confidence bound (UCB) algorithm called LogUCB that balances exploration with exploitation when the average rating of a comment is modeled using logistic regression on its features. At the core of our LogUCB algorithm lies a novel variance approximation technique for the Bayesian logistic regression model that is used to compute the UCB value for each comment. In experiments with a real-life comments dataset from Yahoo! News, we show that LogUCB with bag-of-words and topic features outperforms state-of-the-art explore-exploit algorithms.
收起
摘要 :
Web search queries issued by casual users are often short and with limited expressiveness. Query recommendation is a popular technique employed by search engines to help users refine their queries. Traditional similarity-based met...
展开
Web search queries issued by casual users are often short and with limited expressiveness. Query recommendation is a popular technique employed by search engines to help users refine their queries. Traditional similarity-based methods, however, often result in redundant and monotonic recommendations. We identify five basic requirements of a query recommendation system. In particular, we focus on the requirements of redundancy-free and diversified recommendations. We propose the DQR framework, which mines a search log to achieve two goals: (1) It clusters search log queries to extract query concepts, based on which recommended queries are selected. (2) It employs a probabilistic model and a greedy heuristic algorithm to achieve recommendation diversification. Through a comprehensive user study we compare DQR against five other recommendation methods. Our experiment shows that DQR outperforms the other methods in terms of relevancy, diversity, and ranking performance of the recommendations.
收起
摘要 :
Web search queries issued by casual users are often short and with limited expressiveness. Query recommendation is a popular technique employed by search engines to help users refine their queries. Traditional similarity-based met...
展开
Web search queries issued by casual users are often short and with limited expressiveness. Query recommendation is a popular technique employed by search engines to help users refine their queries. Traditional similarity-based methods, however, often result in redundant and monotonic recommendations. We identify five basic requirements of a query recommendation system. In particular, we focus on the requirements of redundancy-free and diversified recommendations. We propose the DQR framework, which mines a search log to achieve two goals: (1) It clusters search log queries to extract query concepts, based on which recommended queries are selected. (2) It employs a probabilistic model and a greedy heuristic algorithm to achieve recommendation diversification. Through a comprehensive user study we compare DQR against five other recommendation methods. Our experiment shows that DQR outperforms the other methods in terms of relevancy, diversity, and ranking performance of the recommendations.
收起
摘要 :
Reciprocal recommender systems refer to systems from which users can obtain recommendations of other individuals by satisfying preferences of both parties being involved. Different from the traditional user × item recommendation,...
展开
Reciprocal recommender systems refer to systems from which users can obtain recommendations of other individuals by satisfying preferences of both parties being involved. Different from the traditional user × item recommendation, reciprocal recommenders focus on the preferences of both parties simultaneously, as well as some special properties in terms of "reciprocal". In this paper, we propose MEET -a generalized fraMework for rEciprocal rEcommendaTion, in which we model the correlations of users as a bipartite graph that maintains both local and global "reciprocal" utilities. The local utility captures users' mutual preferences, whereas the global utility manages the overall quality of the entire reciprocal network. Extensive empirical evaluation on two real-world data sets (online dating and online recruiting) demonstrates the effectiveness of our proposed framework compared with existing recommendation algorithms. Our analysis also provides deep insights into the special aspects of reciprocal recommenders that differentiate them from user × item recommender systems.
收起
摘要 :
Reciprocal recommender systems refer to systems from which users can obtain recommendations of other individuals by satisfying preferences of both parties being involved. Different from the traditional user × item recommendation,...
展开
Reciprocal recommender systems refer to systems from which users can obtain recommendations of other individuals by satisfying preferences of both parties being involved. Different from the traditional user × item recommendation, reciprocal recommenders focus on the preferences of both parties simultaneously, as well as some special properties in terms of "reciprocal". In this paper, we propose MEET -a generalized fraMework for rEciprocal rEcommendaTion, in which we model the correlations of users as a bipartite graph that maintains both local and global "reciprocal" utilities. The local utility captures users' mutual preferences, whereas the global utility manages the overall quality of the entire reciprocal network. Extensive empirical evaluation on two real-world data sets (online dating and online recruiting) demonstrates the effectiveness of our proposed framework compared with existing recommendation algorithms. Our analysis also provides deep insights into the special aspects of reciprocal recommenders that differentiate them from user × item recommender systems.
收起
摘要 :
Exponential growth of information generated by online social networks demands effective recommender systems to give useful results. Traditional techniques become unqualified because they ignore social relation data; existing socia...
展开
Exponential growth of information generated by online social networks demands effective recommender systems to give useful results. Traditional techniques become unqualified because they ignore social relation data; existing social recommendation approaches consider social network structure, but social context has not been fully considered. It is significant and challenging to fuse social contextual factors which are derived from users' motivation of social behaviors into social recommendation. In this paper, we investigate social recommendation on the basis of psychology and sociology studies, which exhibit two important factors: individual preference and interpersonal influence. We first present the particular importance of these two factors in online item adoption and recommendation. Then we propose a novel probabilistic matrix factorization method to fuse them in latent spaces. We conduct experiments on both Facebook style bidirectional and Twitter style unidirectional social network datasets in China. The empirical result and analysis on these two large datasets demonstrate that our method significantly outperform the existing approaches.
收起
摘要 :
Exponential growth of information generated by online social networks demands effective recommender systems to give useful results. Traditional techniques become unqualified because they ignore social relation data; existing socia...
展开
Exponential growth of information generated by online social networks demands effective recommender systems to give useful results. Traditional techniques become unqualified because they ignore social relation data; existing social recommendation approaches consider social network structure, but social context has not been fully considered. It is significant and challenging to fuse social contextual factors which are derived from users' motivation of social behaviors into social recommendation. In this paper, we investigate social recommendation on the basis of psychology and sociology studies, which exhibit two important factors: individual preference and interpersonal influence. We first present the particular importance of these two factors in online item adoption and recommendation. Then we propose a novel probabilistic matrix factorization method to fuse them in latent spaces. We conduct experiments on both Facebook style bidirectional and Twitter style unidirectional social network datasets in China. The empirical result and analysis on these two large datasets demonstrate that our method significantly outperform the existing approaches.
收起
摘要 :
High utility itemsets refer to the sets of items with high utility like profit in a database, and efficient mining of high utility itemsets plays a crucial role in many real-life applications and is an important, research issue in...
展开
High utility itemsets refer to the sets of items with high utility like profit in a database, and efficient mining of high utility itemsets plays a crucial role in many real-life applications and is an important, research issue in data mining area. To identify high utility itemsets, most existing algorithms first generate candidate itemsets by overestimating their utilities, and subsequently compute the exact utilities of these candidates. These algorithms incur the problem that, a very large number of candidates are generated, but most of the candidates are found out to be not high utility after their exact utilities are computed. In this paper, we propose an algorithm, called HUI-Miner (High Utility Itemset Miner), for high utility itemset mining. HUI-Miner uses a novel structure, called utility-list, to store both the utility information about an itemset and the heuristic information for pruning the search space of HUI-Miner. By avoiding the costly generation and utility computation of numerous candidate itemsets, HUI-Miner can efficiently mine high utility itemsets from the utility-lists constructed from a mined database. We compared HUI-Miner with the state-of-the-art algorithms on various databases, and experimental results show that HUI-Miner outperforms these algorithms in terms of both running time and memory consumption.
收起